Skip to content

Plugin refactor : full version#175

Draft
C-Achard wants to merge 280 commits intomainfrom
cy/feature-refactor-2026
Draft

Plugin refactor : full version#175
C-Achard wants to merge 280 commits intomainfrom
cy/feature-refactor-2026

Conversation

@C-Achard
Copy link
Copy Markdown
Collaborator

@C-Achard C-Achard commented Mar 19, 2026

Plugin refactor

Central PR grouping all refactor efforts into a single all-in-one merge.

While review efforts can be focused on the PRs listed below, this central branch is meant to enable centralized testing, feedback and git maneuvers if required.

Scope

The refactor involves a full reorganization and re-writing of the plugin codebase, with the aim of :

  • Avoiding unsafe data overwrites/losing user annotation work
  • Fixing, improving and expanding the original UX into a more streamlined workflow
  • Extensive testing of existing and new features/logic
  • Small dev workflow improvements (deprecations)

Not planned for this particular PR

Removed

  • Saving segmentation masks to vertices.csv
    • I could not find any meaningful reference or targeted use case in DeepLabCut, so this has been removed for now.
    • It would be easy to expand and re-add later.

Planning

Merge into this branch, in order:

Documentation

Related PRs

Adapt to the refactor and merge:

New features

Add the following features :

Optional features

Useful features that could be added in this refactor or later as needed.

  • Project manager GUI, showing current project layout and offering context-specific quick actions

Add write_config(config_path, params) to napari_deeplabcut.core.io (yaml.safe_dump) and switch callers to use it. Update _widgets to import and call core.io.write_config instead of the private _writer helper. Revise tests to import core.io, replace direct _writer._write_config calls, and enhance writer tests: add provenance helper functions (_add_source_io, _add_save_target), introduce pytest and MissingProvenanceError checks, use monkeypatch to allow overwrite confirmation in tests, and strengthen assertions that promotion merges into existing GT files and does not modify machine prediction files. Minor import reordering and test renames/refactors to reflect the new promotion/IO contract.
Add a _kind_value helper to normalize io.kind values and use it when resolving output paths. Make HDF writes more robust by calling misc.guarantee_multiindex_rows (with exceptions ignored) before merging old/new DataFrames and again after merging, ensuring stable MultiIndex types prior to sorting/writing. Strengthen parse_points_metadata to drop runtime-only/non-JSON fields (e.g., controls and header) before model validation to avoid validation errors from napari runtime objects. Update tests to import and call the _save_layers_dialog widget helper directly.
Remove manual construction of metadata['metadata']['io'] in read_hdf and rely on _attach_source_and_io to populate IO/source fields (dropping the anchor/kind/relposix assembly). Fix test imports to use napari_deeplabcut.core.metadata instead of core.sidecar. Also refresh a few FUTURE/FIXME comment timestamps for consistency.
In src/napari_deeplabcut/_tests/test_e2e.py: update the TODO date string to ISO format (2026-02-17) and remove a duplicated/older _write_keypoints_h5 helper implementation. The duplicate function body was deleted to consolidate the helper into a single definition (the new signature remains below), cleaning up test helpers and preparing them to be moved to conftest as noted in the TODO.
Add uv.lock to .gitignore under a new 'Package managers' section to prevent committing the package manager lock file. Minor formatting added for separation.
Split and standardize annotation I/O and provenance handling:

- Add core modules: discovery (artifact discovery), dataframes (harmonize_keypoint_row_index), provenance (strict IOProvenance handling), and sidecar (folder-scoped JSON sidecar helpers).
- Move single-file HDF reading into core.io.read_hdf_single and use it from the reader; reader now uses discovery to infer AnnotationKind and pass it through.
- Tighten models: IOProvenance.kind and ImageMetadata.kind are strict; code now uses AnnotationKind enum values consistently.
- Update writer to harmonize row index representations before merging (uses harmonize_keypoint_row_index) and adopt AnnotationKind checks for promotion logic.
- Remove duplicated provenance/sidecar logic from core.metadata and centralize provenance construction/attachment into core.provenance/_attach_source_and_io use-sites.
- Update widgets and tests to use AnnotationKind enums and adapt test fixtures for the refactored indexing model.

These changes improve determinism, centralize filesystem/format logic, and make provenance and promotion behavior more robust and explicit.
Remove unused imports, reorganize import ordering, and apply minor formatting/whitespace fixes across multiple modules and tests. Modernize typing usages (Tuple -> tuple) in core.dataframes, remove unused IOProvenance/other imports where not referenced, and adjust import placements (e.g. AnnotationKind/harmonize_keypoint_row_index) to improve clarity. Also apply small test signature reformatting and ensure files end with trailing newlines where appropriate. These are housekeeping changes to reduce lints and keep code style consistent.
Restructure IO and metadata responsibilities: move image/video readers, HDF readers, config read/write, and lazy image helpers into core/io; extract keypoint metadata population into new core/layers.populate_metadata; centralize provenance attachment as attach_source_and_io and use discovery/provenance helpers for kind/io construction. Update imports in _reader, _widgets and _writer to use the new core modules, and adjust pydantic model dumping/merging to use model_dump(mode="python") for safer serialization. This cleans up duplication, improves separation of concerns, and makes provenance/metadata handling deterministic and reusable.
Add the 'FA' rule to ruff's lint.select in pyproject.toml and update imports in src/napari_deeplabcut/core/io.py: move Any to typing.Any and keep Callable from collections.abc. These are stylistic/type-import changes to satisfy the linter; no runtime behavior was altered.
Refactor: consolidate IO/reader utilities into napari_deeplabcut.core.io and update all callers. Renamed internal functions to public APIs (/_load_config -> load_config, _load_superkeypoints(_diagram) -> load_superkeypoints(_diagram)), moved is_video and read_hdf implementations into core.io, and added a LayerData import to the reader. Updated widget code and tests to use the new names and adjusted test file locations/imports accordingly. Also improved core/io docstring and small provenance/config loading fixes.
Add clarifying headers, comments and small behavior improvements to IO routines.

Key changes:
- Add supported image/video constants and section headers for CONFIG, KEYPOINTS/HDF5, SUPERKEYPOINTS, IMAGES, and VIDEO to improve file organization.
- load_config: keep reader minimal and let callers handle errors.
- read_hdf_single: handle legacy single-animal column layout by inserting an "individuals" level; fall back to a default colormap when missing; collapse MultiIndex file paths to string paths; convert image keys to integer indices when numeric and otherwise deterministically encode category paths.
- load_superkeypoints*: add explanatory comments and keep asset loading helpers.
- read_images: document use of OpenCV + Dask for lazy stacks and note stable path canonicalization for UI/metadata.
- Add comment that video reading supports OpenCV with an optional PyAV fallback.

Mostly non-functional cleanup and small compatibility fixes to make annotation and image IO more robust and easier to maintain.
Centralize HDF reading in core.io by removing the duplicate read_hdf from _reader and adding read_hdf implementation to core/io.py. Update read_hdf_single to store the kind directly in meta['io']['kind'] (fixing previous .value usage). Adjust callers and tests to import IO functions from napari_deeplabcut.core.io (read_config, read_images, read_video, _lazy_imread, Video, load_superkeypoints, load_superkeypoints_diagram, is_video, read_hdf) and update widget code to use a single io namespace (import napari_deeplabcut.core.io as io) for load_config, is_video, write_config, and superkeypoint helpers. This removes duplicated logic and fixes the metadata kind assignment.
Store DLC IO provenance and runtime header inside a nested metadata payload (layer.metadata["metadata"]). Widgets now read/write authoritative points metadata from that nested dict, preserve existing header, and place save_target into metadata["metadata"] using model_dump(mode='python'). Tests updated to instantiate Video directly. core.io now uses importlib.resources to locate packaged asset files (assets/*.json and *.jpg) and raises FileNotFoundError when missing. Also includes related cleanup and validation adjustments for robust IO routing.
Move previously module-level helpers into KeypointControls as instance methods: _paste_data, _ensure_promotion_save_target, _save_layers_dialog and on_close. Adjust metadata handling to prefer top-level layer.metadata (sync PointsMetadata and save_target at top level) and wrap the viewer closeEvent to preserve original behavior. Update paste wiring and layer method bindings to use the new instance methods. In core/io, add path-based loaders and helpers (load_superkeypoints_json_from_path, load_superkeypoints_diagram_from_path) and refactor load_superkeypoints/diagram to use them with clearer errors. Update tests to call the controls' _save_layers_dialog and simplify superkeypoints tests accordingly.
Make layer metadata handling explicit across I/O, writer, widgets, and tests. _form_df now accepts layer_metadata and layer_properties (with updated internals and type hints) and write_hdf was updated to use layer_metadata throughout, including resolving output path, parsing points metadata, overwrite confirmation, and marking controls. Keypoint widgets call _form_df with named args and adjust how superkeypoint diagrams are loaded/added (load_superkeypoints_diagram now returns the image only and add_image is given a name/metadata). Tests updated to use a static assets fixture (superkeypoints_assets) that reads bundled JSON, and the diagram test assertion was fixed to check ndim correctly.
Move DLC keypoint writing and DataFrame construction into napari_deeplabcut.core.io and introduce explicit pydantic schemas for validated input (core/schemas.py). Add form_df/form_df_from_validated helpers, atomic HDF write, and a napari-compatible write_hdf implementation that returns written paths. Export name changed from write_hdf to write_hdf_napari_dlc and napari.yaml updated accordingly. Add IO provenance and AnnotationKind support to config models, normalize header/scorer handling, and rename _maybe_confirm_overwrite to maybe_confirm_overwrite (adjusting imports). Update widgets and tests to use the new API and validation flow. Overall this centralizes IO logic, adds input validation, and improves safety when writing/promoting keypoint annotations.
Route empty/placeholder napari save operations through a dedicated token so saver logic can decide actual targets. Tests and the UI now call viewer.layers.save("__dlc__.h5", ...) instead of an empty filename. write_hdf_napari_dlc treats a missing path as "__dlc__.h5" and delegates to write_hdf(None, ...), emitting an info if a real path is supplied. write_hdf signature and callers were adjusted to preserve header metadata by default (parse_points_metadata now keeps header unless explicitly dropped) and a header coercion helper was added to accept runtime header forms. Also updated napari.yaml to use the canonical "points" layer type and added a small misc comment. These changes ensure consistent routing of napari save actions to deeplabcut-specific handling and more robust metadata parsing.
Align writer with napari writer spec and tighten I/O/error handling. write_hdf now returns a list of written paths, enforces stricter provenance for MACHINE sources, builds GT fallback paths when needed, and raises on user-abort instead of returning an empty list. Added validation for empty superkeypoints JSON. Removed the legacy write_masks helper from _writer and commented out a temporary debug logger in _widgets. Tests and fixtures updated accordingly: conftest adds a logging fixture to limit DEBUG noise, tests expect list results/Path names and call the new maybe_confirm_overwrite location. Also fixed napari.yaml layer_types entry for write_masks.
Add robust DLC header handling (DLCHeaderModel.as_multiindex, normalization, and coercion) to support MultiIndex and legacy 3-level headers; propagate scorer reliably when forming DataFrames and reindex using canonical MultiIndex. Introduce utilities for harmonizing columns, aligning old/new frames, detecting keypoint conflicts, and summarizing conflicts for user messages. Harden I/O: prevent implicit writes to MACHINE sources, require valid header for writes, detect ambiguous CollectedData_*.h5 candidates and raise AmbiguousSaveError, and return/write explicit paths. Add is_machine_layer helper and improve metadata parsing/coercion (convert io/save_target kind strings to AnnotationKind). Fix writer to forward path into write_hdf and improve widget promotion logic to update metadata rather than replace it, with additional logging. Update tests to match new behaviors and fix plugin metadata/title duplication in napari.yaml. These changes reduce accidental overwrites of machine predictions and increase compatibility with legacy DLC headers.
Refactor: relocate keypoint conflict and summary utilities from misc to napari_deeplabcut.core.dataframes. Updated imports in core/io.py and ui/dialogs.py to use keypoint_conflicts and summarize_keypoint_conflicts directly. In write_hdf, use the new keypoint_conflicts and apply harmonize_keypoint_column_index to both old and new DataFrames before combining to ensure column alignment. Removed the legacy keypoint helper implementations from misc.py.
Move guarantee_multiindex_rows and merge_multiple_scorers out of misc.py into core.dataframes and refactor writer/IO logic to use them. form_df_from_validated was enhanced to deduplicate keypoint rows, sort by presence of xy, handle single-/multi-animal header levels (insert/drop individuals level as needed), reindex columns against canonical header, and validate that finite coordinates exist when the napari layer contains points (raising a RuntimeError on mismatch). Added logging to aid debugging and updated read/write paths to drop non-finite rows and harmonize indices earlier. Updated imports across widgets, io, writer, and tests; removed the old implementations from misc.py; and adjusted tests to use more robust point-selection (isfinite) and added diagnostic logs.
Replace misc.merge_multiple_scorers usages with a direct import of merge_multiple_scorers from napari_deeplabcut.core.dataframes. Updated test imports and calls in src/napari_deeplabcut/_tests/test_misc.py and replaced the call in read_hdf_single in src/napari_deeplabcut/core/io.py. Also added a FIXME comment in DLCHeaderModel.as_multiindex (src/napari_deeplabcut/config/models.py) to note potential issues when mixing 3- vs 4-level column formats. No functional changes beyond the import refactor and the added comment.
Document and relax assumptions around empty/unlabeled keypoints and the 'individuals' header level. Comment out automatic droplevel of 'individuals' in form_df_from_validated to avoid implicitly removing that header level. Expand read/write docstrings to specify that read_hdf_single produces one Points layer per file, omits unlabeled keypoints, returns only finite coordinates, and that writers must preserve finite coordinates and only write empty Points when promoted. Add notes to populate_keypoint_layer_metadata to accept empty labels/ids/likelihood and not assume at least one point. These changes are primarily clarifications and safety guards to prevent incorrect header/empty-layer handling.
Introduce NaN-stable file signature helpers and tests to ensure writer changes treat NaN (unlabeled) consistently: add file_sig, sig_equal, assert_only_these_changed_nan_safe and several unit tests; replace the previous strict equality check with the NaN-aware assertion. Convert metadata "kind" to use AnnotationKind in tests, and add validators in IOProvenance to coerce/validate 'kind' and to validate 'project_root' is a directory. Minor refactors: use module logger in tests, move DataFrame construction earlier in write_hdf to avoid duplication, tidy a widget docstring, and update napari.yaml writer layer_types. These changes improve correctness of change-detection for DLC h5 files and tighten IO metadata validation.
Introduce new test module src/napari_deeplabcut/_tests/core/test_provenance.py covering provenance helpers. Tests exercise ensure_io_provenance, normalize_provenance, build_io_provenance_dict, and resolve_provenance_path, including validation/error cases (MissingProvenanceError, UnresolvablePathError), path normalization (backslash -> posix), enum preservation in runtime dicts, root_anchor resolution, and allow_missing behavior.
Add extensive unit tests for provenance handling: ensure_io_provenance (enum handling, invalid inputs, missing relpath behavior), normalize_provenance (backslash -> posix conversion), build_io_provenance_dict (preserve enum, exclude None), and resolve_provenance_path (root_anchor vs project_root resolution, requirement checks, allow_missing behavior, and path normalization). Also reorganize and annotate the test file with section headers. Update ensure_io_provenance docstring to reflect that AnnotationKind objects or valid string values are expected at runtime.
Refactor populate_keypoint_layer_metadata to robustly handle empty or missing ids and likelihood values: convert likelihood to a numpy array, compute a safe first_id/use_id discriminator (so empty ids or missing ids don't raise), and derive face_color/text and valid flags from the normalized likelihood array. Update docstring and internal names for clarity.

Add comprehensive unit tests (src/.../core/test_layers_metadata.py) covering single- vs multi-animal color/text selection, empty labels/ids/likelihood behavior, pcutoff logic, and is_machine_layer logging. Also remove noisy debug logs and clarify a test comment in test_e2e.
Normalize labels, ids and likelihood in populate_keypoint_layer_metadata to plain lists/ndarrays so pandas.Series inputs don't trigger truthiness or dtype errors. The change converts labels/ids to lists, ensures likelihood is a numeric ndarray, and then determines single-vs-multi-animal using the normalized ids. Added tests to verify behavior with pandas.Series: single-animal Series, empty Series (defaults to label and empty id list), and multi-animal Series which uses id-based coloring/text.
C-Achard and others added 19 commits April 1, 2026 14:16
Add and improve tooltips in LayerStatusPanel to give clearer user feedback. A tooltip was added to the size label to explain it controls the global point size and that the value is saved to config.yaml as 'dotsize'. The progress tooltip wording was clarified to indicate the count is out of all possible points ("of all possible points labeled") to reduce ambiguity.
Initialize and refine UI behavior for point-size controls and progress text/tooltips. Set default disabled appearance on construction, expand set_point_size_enabled to accept a reason and update enabled/disabled tooltips and value styling, and propagate the reason when disabling. Simplify progress label text and move the detailed breakdown into a tooltip; clear tooltips when no/invalid data. Add set_invalid_points_layer to handle non-DLC keypoints layers and update callers to pass contextual disable reasons to improve UX and clarity.
Introduce _current_dlc_points_layer that verifies a selected Points layer contains valid DLC metadata (uses read_points_meta, handles exceptions/ValidationError and missing header). Update layer-status logic to: use the raw active layer for folder inference, distinguish between no active layer, non-Points layers, and invalid DLC Points layers (set_no_active_points_layer / set_invalid_points_layer), and only enable point-size/progress when the layer is a validated DLC Points layer. Also update size-change and config-commit handlers to use the new validated accessor.
Add tests for LayerStatusPanel to validate UI state updates: verify set_invalid_points_layer disables the size slider/value and updates the progress text, and verify set_no_active_points_layer disables the controls and shows the "No active keypoints layer" message. Tests use qtbot to instantiate the panel and exercise set_point_size_enabled to confirm enabling/disabling behavior.
Rename label to "Points size" and simplify enabling logic so the entire size control container is enabled/disabled (ensuring native disabled styling for both slider and label). Consolidate tooltip creation and apply it to the container, slider, and value, removing manual stylesheet tweaks. Update the disabled hint used when no active layer to reference the global points size.
Wrap the point-size slider and value in a dedicated QWidget so the whole control can have its own layout and tooltip; remove redundant docstring note. Unify and clarify the point-size tooltip, apply it to slider/label/container, and change the form label to "Point size". Initialize the disabled state with a user-facing reason. Improve the progress tooltip to include labeled/remaining percentages and a clearer breakdown. Misc wording tweaks for disabled/reason messages.
Add a QGraphicsOpacityEffect to the point size slider and toggle its opacity (1.0 when enabled, 0.35 when disabled) to visually indicate the control's disabled state. Also import QGraphicsOpacityEffect and apply a small tooltip formatting cleanup for the progress label.
Add a comprehensive suite of unit tests for napari_deeplabcut.core.config_sync: small helper functions and higher-level behaviors. Tests cover _coerce_point_size (rounding/clamping/coercion), _layer_source_path (normal, missing, and failing path access), many resolve_config_path_from_layer scenarios (points meta inference, generic fallback hints, nearest-config fallback, error handling when reading points meta or find_nearest_config fails, passing dataset/anchor candidates to inference, and image-layer inference), and load_point_size_from_config / save_point_size_to_config behaviors (missing path, load/write failures, missing keys, coercion/clamping, unchanged values, and write error handling). Also add a pathlib.Path import used in tests.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Remove debounced/pending write machinery and write point-size to config immediately when committing. Deleted the QTimer, _pending_config_point_size_write attribute, and _flush_pending_point_size_config_write method. _commit_active_points_size_to_config now calls save_point_size_to_config directly, reports success via viewer.status, and logs failures; related scheduling code in the size setter was removed. This simplifies point-size persistence and improves error handling.
Change slider behavior so moves provide a live visual preview while commits occur only on release. setTracking(False) is set on the size slider; sliderMoved is connected to _on_slider_moved_preview to update the label and emit point_size_changed for visual-only updates, while valueChanged is connected to _on_value_changed_commit to update the label and emit point_size_commit_requested for persisting the change. This improves UX by avoiding premature commits during dragging and clarifies signal semantics.
Drop strict comparisons of dlg.count.text() in two tutorial dialog tests. The assertions matching exact "Tip X|{len(dlg._tips)}" formatting were removed from test_tutorial_next_advances_to_first_tip_and_updates_position and test_tutorial_navigation_enables_and_disables_buttons to avoid brittle failures while preserving the button enable/disable and index behavior checks.
Plugin refactor [4-III]:: Centralize keybinds config, improve shortcuts window
Plugin refactor [4-IV]: Allow users to specify config.yaml location for loose labeled folders
Plugin refactor [5]: coverage pass, split code, check performance
Update tox.ini to install the 'dev' extras instead of 'testing'. This aligns the test environment with the project's development dependency group; test command and other settings remain unchanged.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@C-Achard
Copy link
Copy Markdown
Collaborator Author

C-Achard commented Apr 1, 2026

Note : current CI failures are fixed in #182

C-Achard added 7 commits April 2, 2026 10:26
Centralize and harden config.yaml discovery and scorer resolution across the plugin. Key changes:
- Prefer find_nearest_config for automatic config lookup and switch core/io to use it when loading colormap for HDF files.
- Remove legacy misc.find_project_config_path and provenance.find_config_scorer_nearby; provenance now imports find_nearest_config.
- Overhauled the save flow in KeypointControls: auto-load scorer from an auto-discovered config, present a project-config chooser when no config is found, validate unreadable/missing scorers, fall back to sidecar or manual prompt only when appropriate, and persist sidecar defaults when possible.
- Add UI dialog helpers in ui/dialogs.py: load_scorer_from_config, warn_invalid_config_for_scorer, and extend prompt_for_project_config_for_save to optionally resolve and return a scorer.
- Add module-level note in _widgets.py recommending future refactors to keep the file small.

These changes make project association and scorer selection more robust and user-friendly, add clearer warnings for invalid configs, and consolidate config lookup logic.
Introduce skip_project_config_dialog fixture and expand e2e tests to cover the new project-config save flow (auto-discovery, user-specified external config, sidecar precedence, and skip behavior). Update several test names and expectations to assert config-selection dialog behavior and scorer resolution, and add new tests for config-winning logic. Also tidy _widgets.py imports by importing find_nearest_config from project_paths (removing the duplicate import from provenance).
Extend dialog tests to cover project config and scorer resolution flows. Adds pytest import, new fake Qt helpers (_FakeMessageBox, _FakeFileDialog, _FakeButton) and a fake_config_prompt_qt fixture that monkeypatches ui_dialogs.QMessageBox and QFileDialog. Introduces tests for load_scorer_from_config (trim, missing, blank), prompt_for_project_config_for_save (skip, cancel, associate valid config, handle missing scorer, unreadable config, custom UI text) and warn_invalid_config_for_scorer. These tests exercise scorer resolution logic and error handling using tmp_path and monkeypatching.
Guard against non-dict parsed configs when loading/saving the point-size key. load_point_size_from_config now logs and returns None if the config is not a mapping; save_point_size_to_config logs and replaces a non-mapping config with an empty dict before updating the point-size value. This prevents type errors when the config file's top-level YAML value isn't a mapping.
Ensure non-mouse or programmatic size changes also update the visual layer. Previously only point_size_commit_requested was emitted (for saving/persistence), which could leave visuals out-of-sync. This adds a call to point_size_changed.emit(int(value)) in _on_value_changed_commit to trigger an immediate visual update when the value is committed.
Clarified installation instructions and added an example for creating and activating a uv virtual environment (using Python 3.12) before installing with uv. Small wording tweaks: note that pip example is shown e.g. for a conda environment, and the widget auto-open phrasing was made clearer. Added guidance about config.yaml: several plugin functions expect the config two folders up from saved CollectedData files, fallback behavior exists but you will be prompted to provide a matching config if it is missing; the plugin will then save points/metadata into the correct folder based on that config. Recommended moving the dataset into the project structure and updating paths in config.yaml for full compatibility.
Replace the tests badge URL in README.md to point to the new GitHub Actions workflow (test_and_deploy.yml) and include branch=main in the badge URL so the status reflects the correct workflow/branch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unable to crop manually extracted frames in napari Can't add body parts to labeling GUI after extracting outlier frames

1 participant